Better Huffman Coding via Genetic Algorithm

Authors

  • Cody Boisclair
  • Markus Wagner
Abstract

We present an approach to compressing arbitrary files using a Huffman-like prefix-free code generated by a genetic algorithm, which requires no prior knowledge of substring frequencies in the original file. The approach also allows multiple-character substrings to be encoded. Tests on real-world data in a variety of formats show that, in some domains, the genetic approach has a significant advantage over the traditional Huffman algorithm and other existing compression methods.
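
For context, the traditional Huffman baseline that the abstract compares against can be sketched as follows. This is a minimal Python illustration of the classical algorithm only, not the authors' genetic method, whose details are not given on this page. The function names `huffman_code`, `is_prefix_free`, and `encoded_bits` are hypothetical names chosen for this sketch.

```python
import heapq
from collections import Counter


def huffman_code(freqs):
    """Build a prefix-free code from a {symbol: count} mapping.

    Classical Huffman construction: repeatedly merge the two
    lowest-weight subtrees until one tree remains.
    """
    if not freqs:
        return {}
    if len(freqs) == 1:
        # A single symbol still needs one bit per occurrence.
        return {next(iter(freqs)): "0"}

    # Heap entries are (weight, tie_breaker, {symbol: partial_code}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Prepend one bit to every code word in each merged subtree.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


def is_prefix_free(code):
    """Return True if no code word is a prefix of another."""
    words = sorted(code.values())
    # After sorting, any prefix relation shows up between neighbours.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def encoded_bits(symbols, code):
    """Total number of bits needed to encode a sequence of symbols."""
    return sum(len(code[s]) for s in symbols)


if __name__ == "__main__":
    text = "abracadabra"
    code = huffman_code(Counter(text))
    print(code)
    print(is_prefix_free(code), encoded_bits(text, code))
```

The sketch treats symbols as arbitrary hashable tokens, so a multiple-character substring could serve as a symbol in the frequency table. How the paper's genetic algorithm chooses such substrings, and how it avoids needing the frequencies up front, is not described in the abstract and is not modeled here.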

Similar resources

On the random property of compressed data via Huffman coding

Though Huffman codes [2,3,4,5,9] have shown their power in data compression, there are still some issues that are not noticed. In the present paper, we address the issue on the random property of compressed data via Huffman coding. Randomized computation is the only known method for many notoriously difficult #P-complete problems such as permanent, and some network reliability problems, etc [1,...

Evaluation of Huffman and Arithmetic Algorithms for Multimedia Compression Standards

Compression is a technique to reduce the quantity of data without excessively reducing the quality of the multimedia data. The transition and storing of compressed multimedia data is much faster and more efficient than original uncompressed multimedia data. There are various techniques and standards for multimedia data compression, especially for image compression such as the JPEG and JPEG2000 s...

Data Compression Considering Text Files

Lossless text data compression is an important field as it significantly reduces storage requirement and communication cost. In this work, the focus is directed mainly to different file compression coding techniques and comparisons between them. Some memory efficient encoding schemes are analyzed and implemented in this work. They are: Shannon Fano Coding, Huffman Coding, Repeated Huffman Codin...

Twenty (or so) Questions: Bounded-Length Huffman Coding

The game of Twenty Questions has long been used to illustrate binary source coding. Recently, a physical device has been developed which mimics the process of playing Twenty Questions, with the device supplying the questions and the user providing the answers. However, this game differs from Twenty Questions in two ways: Answers need not be only “yes” and “no,” and the device continues to ask q...

Constructing Binary Huffman Tree

Huffman coding is one of the most famous entropy encoding methods for lossless data compression [16]. JPEG and ZIP formats employ variants of Huffman encoding as lossless compression algorithms. Huffman coding is a bijective map from source letters into leaves of the Huffman tree constructed by the algorithm. In this article we formalize an algorithm that constructs a binary code tree, the Huffman tree.

Publication date: 2008